Abstract: The users at the cell edge of a massive multiple-input
multiple-output (MIMO) system suffer from severe pilot
contamination, which leads to poor quality of service (QoS). In order
to enhance the QoS for these edge users, soft pilot reuse (SPR)
combined with multi-cell block diagonalization (MBD) precoding
is proposed. Specifically, the users are divided into two groups
according to their large-scale fading coefficients, referred to as
the center users, who only suffer from modest pilot contamination,
and the edge users, who suffer from severe pilot contamination.
Based on this distinction, the SPR scheme is proposed for 
improving the QoS for the edge users, whereby a cell-center 
pilot group is reused for all cell-center users in all cells, while a 
cell-edge pilot group is applied for the edge users in the adjacent 
cells. By extending the classical block diagonalization precoding 
to a multi-cell scenario, the MBD precoding scheme projects the 
downlink transmit signal onto the null space of the subspace 
spanned by the inter-cell channels of the edge users in adjacent 
cells. Thus, the inter-cell interference contaminating the edge 
users’ signals in the adjacent cells can be efficiently mitigated and 
hence the QoS of these edge users can be further enhanced. Our 
theoretical analysis and simulation results demonstrate that both 
the uplink and downlink rates of the edge users are significantly 
improved, albeit at the cost of a slightly decreased rate for the
center users.

Index Terms —Massive multiple-input multiple-output system, 
pilot contamination, inter-cell interference, quality of service, soft 
pilot reuse, multi-cell block diagonalization precoding 


I. Introduction 

In an effort to meet the escalating demand for increasingly
higher-capacity and improved-reliability wireless systems, the
‘massive’ or large-scale multiple-input multiple-output
(LS-MIMO) concept has been proposed [?], where typically
each base station (BS) is equipped with a large number of
antenna elements (AEs) to serve a much smaller number of
single-AE users. This way each user may have access to
several AEs. This large-scale MIMO technology offers several
significant advantages in comparison to the conventional
MIMO concept having a moderate number of AEs. Firstly,
asymptotic analysis based on random matrix theory [2] demonstrates
that both the intra-cell interference and the uncorrelated
noise effects can be efficiently mitigated, as the number of AEs
tends to infinity. Furthermore, the energy consumption of cellular
BSs can be substantially reduced [?] and the LS-MIMO
systems are robust, since the failure of one or a few of the
AEs and radio frequency (RF) chains would not appreciably
affect the resultant system performance [?]. Additionally,
low-complexity signal processing relying on matched-filter (MF)
based transmit precoding (TPC) and detection can be used
for approaching the optimal performance, when the number of
AEs at the BS tends to infinity [?].

Similar to conventional MIMO systems, knowledge of the
channel state information (CSI) is also required at the BS of
LS-MIMO systems, namely for data detection in the uplink
(UL) and for multi-user TPC in the downlink (DL) [?], [5]. In
the time-division duplexing (TDD) protocol, the BS estimates
the UL channels and obtains the DL CSI by exploiting the
channel’s reciprocity [2], [?], [?]. However, this approach
suffers from the so-called pilot contamination (PC) problem
[?]-[?] in multi-cell multi-user scenarios due to the reuse of
the pilot sequences in adjacent cells, which imposes grave
interference on the channel estimate at the BS. Furthermore, the
commonly used MF and zero-forcing (ZF) TPC schemes will
impose inter-cell interference (ICI) on the DL transmission,
which cannot be reduced by increasing the number of AEs at
the BS.

Hence, the problem of ICI and PC has been extensively
studied [?]-[?]. The fractional frequency reuse (FFR)
scheme [?], [?] adopted in LTE Release 9 aims for mitigating
the ICI by assigning orthogonal frequency bands to edge
users in the adjacent cells at the cost of additional spectral
resources. The original frequency-division duplexing (FDD)
based coordinated multi-point (CoMP) transmission of LTE-A
Release 11 [?] is able to avoid the ICI between adjacent cells,
whereby each user estimates and feeds back the quantized DL
channel from all adjacent cells to the corresponding BS, and
then the BS distributes the CSI to adjacent cells. However, this
kind of FDD based CoMP technique would not be feasible for
massive MIMO, since the CSI feedback overhead becomes
huge as the number of BS antennas increases [10]. Using
time-shifted pilot sequences for asynchronous transmission
among the adjacent cells [?], [?] partially mitigates this
problem, but it leads to mutual interference between data
transmission and pilot transmission. A TPC scheme can be
used for mitigating the ICI with the aid of joint multi-cell
processing [?], [?], but again, this imposes a high information
exchange overhead. The authors of [?] imposed specific
conditions on the channel’s covariance matrix, which is only
valid for the asymptotic case of infinitely many AEs at the
BS. The angle-of-arrival (AOA) based methods of [?], [?]
exploit the fact that the users having mutually non-overlapping
AOAs hardly contaminate each other even if they use the
same pilot sequence, but naturally, the efficiency of these
methods relies on the assumption that the AOA spread of each
user is small, which is not always the case under realistic
channel conditions. A data-aided channel estimation scheme
was proposed in [?], whereby partially decoded data is
used for estimating the channel and the PC effects can be
beneficially reduced by iterative processing at the cost of an
increased computational complexity. Additionally, the blind
method of [?], [20] based on subspace partitioning is capable
of reducing the ICI under the assumption that the channel
vectors of different users are orthogonal, which is not often
the case in practice. The scheme proposed in [21] is capable
of eliminating PC altogether, but this is achieved with the
aid of a complex DL and UL training procedure. Note that
all these existing contributions treat all users in the same way,
as though they suffer from the same PC, but in reality the
severity of PC varies among the users.

Against the above background, inspired by the FFR
scheme [?] adopted in LTE Release 9, we propose a soft
pilot reuse (SPR) scheme for mitigating the PC of LS-MIMO
systems, whereby a cell-edge pilot group is applied for the
cell-edge users in adjacent cells, while the cell-center users
reuse the same center pilot group in all cells. Furthermore, by
extending the classical block diagonalization (BD) precoding
[?] to a multi-cell scenario, a multi-cell block diagonalization
(MBD) TPC technique is conceived for mitigating the ICI and
for enhancing the quality of service (QoS) for the edge users.
Specifically, the contributions of this paper are summarized as
follows.

• We break away from the traditional practice of treating 
the PC for all users identically - instead, we divide 
the users into two different groups to be considered 
separately, namely, center users subjected to a slight PC 
and the edge users suffering from more severe PC. In this 
way, the center users can benefit directly from the LS- 
MIMO technology and the efforts can be directed towards 
improving the QoS for the edge users. 

• In contrast to the FFR scheme, which assigns orthogonal
frequency bands to the edge users in adjacent cells, the
proposed SPR scheme divides the pilots into two
groups within the same frequency band, namely a center
pilot group, which is reused for the center users in all
cells, and an edge pilot group, which is applied for
the edge users in adjacent cells. Thus, for the edge
users, the accuracy of the channel estimation is improved
and the UL achievable rate is increased. Moreover, by
using slightly more pilot resources for edge users, the
BS becomes capable of estimating not only the intra-cell
channels of the users within the reference cell, but also
the ‘inter-cell channels’ of the edge
users in the adjacent cells.

• In contrast to the original CoMP technique, which has to obtain
the inter-cell channels at the cost of a large overhead [?], [?],
the proposed MBD precoding can directly exploit
the partial knowledge of the ‘inter-cell channels’ and
is capable of suppressing the ICI imposed on the edge
users of the adjacent cells. Specifically, by extending the
classical BD TPC to a multi-cell scenario, the MBD TPC
projects the DL transmit signal onto the null space of
the subspace spanned by the partially known ‘inter-cell
channels’. Thus, the ICI imposed on the edge users of
the adjacent cells can be substantially mitigated, hence
the QoS of the edge users is significantly enhanced.

• In order to analyze the performance of our proposal,
we compare the associated pilot requirements, derive the
attainable average UL as well as DL rate and characterize
the computational complexity imposed. Our theoretical
derivation confirms that both the achievable UL and DL
rates of the edge users are significantly improved at the
cost of requiring slightly more pilots. Moreover, our
simulation results show that the average UL and DL cell
throughput in the SPR and MBD aided system is able
to approach and even exceed that of the conventional
system, provided that a modestly increased number of
BS AEs is affordable.

Fig. 1. Illustration of multi-user multi-cell LS-MIMO system.

The rest of the paper is organized as follows. In Section II
we briefly review the multi-cell LS-MIMO system model,
while Section III is devoted to detailing the PC, which is the
main performance-limiting factor of LS-MIMO systems. Section IV
further details the motivation of this paper, while the
proposed SPR scheme and MBD precoding are discussed in
Section V. Section VI provides our performance analysis of the
proposed SPR scheme and MBD precoding. Our simulation
results quantifying the benefits of our proposals are presented
in Section VII, while our conclusions follow in Section VIII.

Throughout our discussions, boldface lower and upper-case
symbols represent vectors and matrices, respectively. The
transpose, conjugate, and Hermitian transpose operators are
given by $(\cdot)^T$, $(\cdot)^*$, and $(\cdot)^H$, respectively. The Moore-Penrose
pseudo-inverse operator is denoted by $(\cdot)^\dagger$ and the trace
operator is represented by $\mathrm{Tr}(\cdot)$, while $\mathrm{diag}\{a_1, a_2, \cdots, a_m\}$
denotes the diagonal matrix having $a_1, a_2, \cdots, a_m$ as
its diagonal entries and the $M \times M$ identity matrix is given by
$\mathbf{I}_M$. The number of elements in a set is denoted by $\mathrm{card}\{\cdot\}$,
and the $l_p$ norm is denoted by $\|\cdot\|_p$, while the expectation
operator is given by $\mathrm{E}\{\cdot\}$.

II. System Model 

A multi-cell multi-user LS-MIMO system is illustrated in
Fig. 1, which is composed of $L$ hexagonal cells, each having
a central BS associated with $M$ antennas to serve $K$ ($K \ll M$)
single-antenna users [?], [?]. The channel vector
$\mathbf{h}_{i,j,k} \in \mathbb{C}^{M \times 1}$ of the link spanning from the $k$-th user of the $j$-th cell
to the BS of the $i$-th cell can be formulated as

$$\mathbf{h}_{i,j,k} = \mathbf{g}_{i,j,k} \sqrt{\beta_{i,j,k}}. \tag{1}$$

The small-scale fading vectors $\mathbf{g}_{i,j,k} \in \mathbb{C}^{M \times 1}$ are statistically
independent for the $K$ users and they obey the complex-valued
Gaussian distribution having a zero-mean vector and
a covariance matrix $\mathbf{I}_M$, hence we have $\mathbf{g}_{i,j,k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$.
Still referring to Eq. (1), the large-scale fading coefficients $\beta_{i,j,k}$
are the same for the different antennas at the same BS, but
they are user-dependent. Moreover, they are related to both the
pathloss and shadow fading, which will be addressed in detail
in the context of our simulations. Thus, the channel matrix of
all the $K$ users in the $j$-th cell and the BS in the $i$-th cell can
be represented by

$$\mathbf{H}_{i,j} = [\mathbf{h}_{i,j,1}\ \mathbf{h}_{i,j,2}\ \cdots\ \mathbf{h}_{i,j,K}] = [\mathbf{g}_{i,j,1}\ \mathbf{g}_{i,j,2}\ \cdots\ \mathbf{g}_{i,j,K}]\, \mathbf{D}_{i,j}^{1/2}, \tag{2}$$

where $\mathbf{D}_{i,j} = \mathrm{diag}(\beta_{i,j,1}, \beta_{i,j,2}, \cdots, \beta_{i,j,K})$ denotes the large-scale
fading matrix relating all the $K$ users in the $j$-th cell to
the $i$-th cell’s BS.

Fig. 2. Multi-user multi-cell MIMO TDD protocol (stages: uplink data, uplink pilot, BS processing, downlink data).

By adopting the TDD protocol, the BS obtains the DL 
channel estimate by exploiting the reciprocity of the UL and 
DL channels. More specifically, both the small-scale fading 
vectors and the large-scale fading coefficients may be deemed 
to be equal for both the DL and UL directions, provided 
that the bandwidth is sufficiently narrow for avoiding the 
independent fading of the DL and UL. 

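Before considering the PC phenomenon, we summarize
the asymptotic orthogonality of random vectors [?]. Let
$\mathbf{x}, \mathbf{y} \in \mathbb{C}^{M \times 1}$ be two independent vectors with distribution
$\mathcal{CN}(\mathbf{0}, c\mathbf{I}_M)$. Then from the law of large numbers, we have

$$\lim_{M \to \infty} \frac{\mathbf{x}^H \mathbf{x}}{M} \overset{\text{a.s.}}{\longrightarrow} c \quad \text{and} \quad \lim_{M \to \infty} \frac{\mathbf{x}^H \mathbf{y}}{M} \overset{\text{a.s.}}{\longrightarrow} 0, \tag{3}$$

where $\overset{\text{a.s.}}{\longrightarrow}$ denotes the almost sure convergence.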
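
The asymptotic orthogonality in (3) can be checked numerically. The following is a minimal sketch rather than part of the original analysis: it assumes NumPy, and the function name `normalized_inner_products`, the seed and the chosen values of M and c are purely illustrative.

```python
import numpy as np


def normalized_inner_products(M, c=1.0, seed=0):
    """Return (x^H x / M, x^H y / M) for independent x, y ~ CN(0, c I_M)."""
    rng = np.random.default_rng(seed)
    # Real and imaginary parts each carry half of the variance c.
    scale = np.sqrt(c / 2.0)
    x = scale * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    y = scale * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    # np.vdot conjugates its first argument, i.e. it computes x^H y.
    return np.vdot(x, x).real / M, np.vdot(x, y) / M


if __name__ == "__main__":
    for M in (10, 1000, 100000):
        self_term, cross_term = normalized_inner_products(M, c=2.0)
        print(M, self_term, abs(cross_term))
```

As M grows, the first value should settle near c, while the magnitude of the second should shrink towards zero, roughly at a rate of $1/\sqrt{M}$.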


III. Pilot Contamination 

By considering the TDD protocol, we adopt the widely
used block-fading channel model, whereby the channel vectors
$\mathbf{h}_{i,j,k}$ remain constant during the channel’s coherence interval.
As shown in Fig. 2, each coherence interval is comprised
of four stages for each user [?]: 1) UL data transmission;
2) UL pilot transmission; 3) BS processing; and 4) DL data
transmission.

At the first stage, all users in all cells synchronously
send UL data to their corresponding BSs and the user data
received at the BS in the $i$-th cell can be represented as
$\mathbf{y}_i^u = \sqrt{p_u} \sum_{j=1}^{L} \sum_{k=1}^{K} \mathbf{h}_{i,j,k} x_{j,k}^u + \mathbf{n}_i^u$, where $x_{j,k}^u$ with
$\mathrm{E}\{|x_{j,k}^u|^2\} = 1$ denotes the symbol transmitted from the $k$-th
user roaming in the $j$-th cell, $p_u$ represents the UL data transmission
power and $\mathbf{n}_i^u \in \mathbb{C}^{M \times 1}$ denotes the corresponding
UL channel’s additive white Gaussian noise (AWGN) vector
associated with $\mathrm{E}\{\mathbf{n}_i^u (\mathbf{n}_i^u)^H\} = (\sigma^u)^2 \mathbf{I}_M$.

For a typical LS-MIMO system, the pilot sequences used
within a specific cell are orthogonal, but the same pilot group
is typically reused in the adjacent cells due to the limited
number of orthogonal pilot sequences. Thus, during the second
stage, the matrix of pilot sequences received at the BS of the
$i$-th cell, which is denoted by $\mathbf{Y}_i^p \in \mathbb{C}^{M \times \tau}$, can be represented
as $\mathbf{Y}_i^p = \sqrt{p_p} \sum_{j=1}^{L} \mathbf{H}_{i,j} \boldsymbol{\Phi} + \mathbf{N}_i^p$, where the matrix
$\boldsymbol{\Phi} = [\boldsymbol{\phi}_1\ \boldsymbol{\phi}_2\ \cdots\ \boldsymbol{\phi}_K]^T \in \mathbb{C}^{K \times \tau}$ containing the transmitted pilot
sequences satisfies $\boldsymbol{\Phi} \boldsymbol{\Phi}^H = \mathbf{I}_K$, $p_p$ is the transmission power of
the pilots and $\mathbf{N}_i^p \in \mathbb{C}^{M \times \tau}$ denotes the UL channel’s AWGN
matrix.

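During the third stage, the BS of the $i$-th cell obtains an
estimate $\hat{\mathbf{H}}_{i,i}$ of the channel matrix $\mathbf{H}_{i,i}$ using a conventional
channel estimation method, namely by directly correlating the received
pilot matrix with the local pilot matrix, yielding

$$\hat{\mathbf{H}}_{i,i} = \frac{1}{\sqrt{p_p}} \mathbf{Y}_i^p \boldsymbol{\Phi}^H = \mathbf{H}_{i,i} + \sum_{j \neq i} \mathbf{H}_{i,j} + \frac{1}{\sqrt{p_p}} \mathbf{N}_i^p \boldsymbol{\Phi}^H. \tag{4}$$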


It can readily be seen that the channel estimate of the $k$-th
user in the $i$-th cell, namely $\hat{\mathbf{h}}_{i,i,k}$, is a linear combination of
the channels $\mathbf{h}_{i,j,k}$ for $1 \le j \le L$, which include the channels
of the users in the other cells associated with the same pilot
sequence. This phenomenon is referred to as PC [?]-[?]. Given
the estimated channel matrix $\hat{\mathbf{H}}_{i,i}$ and by adopting the
low-complexity MF detector, the detected symbol arriving from
the $k$-th user in the $i$-th cell can be represented as

$$\begin{aligned}
\hat{x}_{i,k}^u &= \hat{\mathbf{h}}_{i,i,k}^H \mathbf{y}_i^u \\
&= \sqrt{p_u}\, \hat{\mathbf{h}}_{i,i,k}^H \mathbf{h}_{i,i,k} x_{i,k}^u + \sqrt{p_u} \sum_{j \neq i} \hat{\mathbf{h}}_{i,i,k}^H \mathbf{h}_{i,j,k} x_{j,k}^u + \varepsilon_{i,k}^u \\
&\overset{(a)}{\approx} M \sqrt{p_u} \Big( \beta_{i,i,k} x_{i,k}^u + \sum_{j \neq i} \beta_{i,j,k} x_{j,k}^u \Big),
\end{aligned} \tag{5}$$

where $\hat{\mathbf{h}}_{i,i,k}$ denotes the $k$-th column of $\hat{\mathbf{H}}_{i,i}$, $\varepsilon_{i,k}^u$ represents
the interference, which can be reduced to an arbitrarily
low level by increasing the number of antennas $M$ at
the BS, and $\overset{(a)}{\approx}$ indicates that the approximation holds by
invoking the asymptotic orthogonality associated with $M \to \infty$.
Thus, the UL signal to interference plus noise ratio (SINR) of
the $k$-th user in the $i$-th cell can be calculated as


$$\mathrm{SINR}_{i,k}^u = \frac{|\hat{\mathbf{h}}_{i,i,k}^H \mathbf{h}_{i,i,k}|^2}{\sum_{j \neq i} |\hat{\mathbf{h}}_{i,i,k}^H \mathbf{h}_{i,j,k}|^2 + |\varepsilon_{i,k}^u|^2 / p_u} \overset{(a)}{\approx} \frac{\beta_{i,i,k}^2}{\sum_{j \neq i} \beta_{i,j,k}^2}, \tag{6}$$

and the achievable UL rate can be expressed as
$C_{i,k}^u = (1 - \mu)\mathrm{E}\{\log_2(1 + \mathrm{SINR}_{i,k}^u)\}$, where $0 < \mu < 1$ evaluates the
spectral efficiency reduction caused by the pilot transmission
[24]. It is clear that the UL achievable rate remains limited by the
PC and it cannot be increased by simply assigning an increased
transmission power and/or pilot power, i.e., by increasing $p_u$
and/or $p_p$.

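The PC affects the DL transmission during the fourth
stage as well. The normalized MF precoding matrix [?]
is commonly used for the DL transmission, which can be
represented by $\mathbf{W}_i = \frac{1}{\sqrt{\gamma_i}} \hat{\mathbf{H}}_{i,i}^*$, where $\gamma_i = \mathrm{Tr}(\hat{\mathbf{H}}_{i,i}^T \hat{\mathbf{H}}_{i,i}^*)/K$
is a normalization factor. The BS in the $i$-th cell transmits
an $M$-dimensional signal vector as $\mathbf{s}_i = \mathbf{W}_i \mathbf{x}_i^d$, where
$\mathbf{x}_i^d = [x_{i,1}^d\ x_{i,2}^d\ \cdots\ x_{i,K}^d]^T$ with $\mathrm{E}\{|x_{i,k}^d|^2\} = 1$ denotes
the source symbol vector for the $K$ users in the $i$-th cell.
The received signals of the $K$ users in the $i$-th cell can be
collected together as $\mathbf{y}_i^d = \sqrt{p_d} \sum_{j=1}^{L} \mathbf{H}_{j,i}^T \mathbf{W}_j \mathbf{x}_j^d + \mathbf{n}_i^d$,
where $p_d$ denotes the DL transmission power and $\mathbf{n}_i^d$ denotes the
DL channel’s AWGN vector associated
with $\mathrm{E}\{\mathbf{n}_i^d (\mathbf{n}_i^d)^H\} = (\sigma^d)^2 \mathbf{I}_K$. Similar to the derivation seen
in (6), the DL SINR of the $k$-th user in the $i$-th cell can be
derived as

$$\mathrm{SINR}_{i,k}^d = \frac{|\mathbf{h}_{i,i,k}^T \hat{\mathbf{h}}_{i,i,k}^*|^2}{\sum_{j \neq i} |\mathbf{h}_{j,i,k}^T \hat{\mathbf{h}}_{j,j,k}^*|^2 + |\varepsilon_{i,k}^d|^2 / p_d} \overset{(a)}{\approx} \frac{\beta_{i,i,k}^2}{\sum_{j \neq i} \beta_{j,i,k}^2}, \tag{7}$$

where $\varepsilon_{i,k}^d$ denotes the corresponding interference similar to
$\varepsilon_{i,k}^u$ given in (5). The corresponding DL rate can be represented
as $C_{i,k}^d = (1 - \mu)\mathrm{E}\{\log_2(1 + \mathrm{SINR}_{i,k}^d)\}$.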

In summary, the PC caused by the reuse of the same 
orthogonal pilot group in adjacent cells cannot be reduced by 
increasing the number of antennas at the BS, hence it limits 
the achievable performance of multi-cell multi-user LS-MIMO 
systems. 
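
The saturation of the UL SINR can be illustrated by a small Monte-Carlo sketch. It is not taken from the paper's simulations and rests on simplifying assumptions that are ours: a single pilot-sharing user per cell, noise-free pilot and data reception, the reference cell stored at index 0, and hypothetical function names and fading values.

```python
import numpy as np


def uplink_mf_sinr(betas, M, seed=0):
    """MF-detector SINR of the reference user under pilot contamination.

    betas[j] is the large-scale fading coefficient between the pilot-sharing
    user of cell j and the reference BS; index 0 is the desired user.
    """
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=float)
    g = rng.standard_normal((M, betas.size)) + 1j * rng.standard_normal((M, betas.size))
    h = (g / np.sqrt(2.0)) * np.sqrt(betas)   # column j equals h_{i,j,k}
    h_hat = h.sum(axis=1)                     # contaminated estimate (noise-free)
    gains = np.abs(h_hat.conj() @ h) ** 2     # |h_hat^H h_{i,j,k}|^2 for every j
    return gains[0] / gains[1:].sum()


def asymptotic_sinr(betas):
    """Large-M limit: beta_0^2 divided by the sum of the other beta_j^2."""
    betas = np.asarray(betas, dtype=float)
    return betas[0] ** 2 / np.sum(betas[1:] ** 2)


if __name__ == "__main__":
    betas = [1.0, 0.3, 0.2]
    for M in (16, 256, 4096, 65536):
        print(M, uplink_mf_sinr(betas, M), asymptotic_sinr(betas))
```

Under these assumptions the simulated SINR approaches the ratio of squared large-scale fading coefficients as M grows, instead of increasing without bound.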


IV. Motivation of Our Proposal 


In the existing state-of-the-art solutions [?], [8], [?]-[21],
which aim for reducing the PC, all users are treated identically.
However, according to (6) and (7), it becomes clear that
the attainable SINR is proportional to the large-scale fading
coefficients $\beta_{i,i,k}^2$, which are different for the $K$ users of each
cell. Thus, we have to break away from this traditional concept
of treating the PC for all users identically, which motivates
our idea of dividing the users of each cell into two groups,
namely the group of center users subjected to modest PC and
the group of edge users suffering from severe PC. We will
treat them differently.

In fact, the limit of the UL SINR of the $k$-th user in the
$i$-th cell, which is defined by

$$\eta_{i,k} = \frac{\beta_{i,i,k}^2}{\sum_{j \neq i} \beta_{i,j,k}^2}, \tag{8}$$

specifies the severity of the PC for this user. Therefore, it is
easy to sort the users in a cell according to their SINR values
$\eta_{i,k}$, if all the large-scale fading coefficients $\{\beta_{i,j,k}^2\}$ are known
at the BS, which is a key assumption stipulated in the
state-of-the-art contributions [?], [?], [?]. However, in practice, it is
difficult for the BS to obtain an accurate estimate of the
large-scale fading coefficients of the users in other cells, i.e., of $\beta_{i,j,k}^2$
for $j \neq i$, unless BS-cooperation is invoked, which is typically
associated with a substantial side-information overhead.

Since we have $\eta_{i,k} \propto \beta_{i,i,k}^2$, we may also use $\beta_{i,i,k}^2$ for
estimating the severity of the PC for the $k$-th user roaming in
the $i$-th cell. In contrast to $\beta_{i,j,k}^2$ for $j \neq i$, all the large-scale
fading coefficients $\{\beta_{i,i,k}^2\}$ of the $K$ users in the $i$-th cell can
be readily obtained. Thus, the $K$ users in the $i$-th cell can be
readily divided into two groups according to

$$\beta_{i,i,k}^2 > \rho_i\ ? \begin{cases} \text{Yes} \rightarrow \text{center users}, \\ \text{No} \rightarrow \text{edge users}. \end{cases} \tag{9}$$

The user-grouping threshold $\rho_i$ can be set to

$$\rho_i = \frac{\lambda}{K} \sum_{k=1}^{K} \beta_{i,i,k}^2, \tag{10}$$

where $\lambda$ can be adjusted according to the specific system
configuration. A simple case is illustrated in Fig. 3, where
according to the large-scale fading coefficients $\{\beta_{i,i,k}^2\}$ and
the given threshold $\rho_i$, the users are divided into two groups,
namely the center users associated with only a slight PC and
the edge users subjected to severe PC. Note that the threshold
$\rho_i$ is not based on the geographic locations of the users - it is
rather based on the signal space of $\{\beta_{i,i,k}^2\}$.
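
The grouping rule can be sketched in a few lines. This is an illustrative sketch only: it assumes that the threshold in (10) is $\lambda$ times the mean of the squared own-cell large-scale fading coefficients, and the function name `group_users` as well as the numerical values are hypothetical.

```python
import numpy as np


def group_users(beta_sq, lam):
    """Split the users of one cell into center and edge groups.

    beta_sq[k] plays the role of beta_{i,i,k}^2; lam is the tunable factor.
    Returns the threshold rho and the index arrays of center and edge users.
    """
    beta_sq = np.asarray(beta_sq, dtype=float)
    rho = lam * beta_sq.mean()               # assumed form of (10): (lam / K) * sum_k beta_sq[k]
    center = np.flatnonzero(beta_sq > rho)   # "Yes" branch of (9)
    edge = np.flatnonzero(beta_sq <= rho)    # "No" branch of (9)
    return rho, center, edge


if __name__ == "__main__":
    rho, center, edge = group_users([1.0, 0.5, 0.02, 0.08], lam=0.1)
    print(rho, center, edge)
```

A larger value of `lam` moves more users into the edge group, which improves their protection at the cost of more edge pilots.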

Since the center users only suffer from minor PC, the conventional
LS-MIMO scheme outlined in the previous section
is capable of attaining a high performance. By contrast, the 
edge users suffer from serious pilot contamination, hence their 
performance based on the conventional LS-MIMO scheme 
is expected to be poor. In order to enhance the QoS of 
the edge users, who suffer from heavy PC, we propose the 
more sophisticated SPR scheme and MBD precoding in the 
following section. 


V. The Proposed Soft Pilot-Reuse Scheme and 
Multi-cell Block Diagonalization Precoding 

Based on the division of users into two groups as outlined 
in the previous section, it is plausible that the center users 
indeed benefit from the conventional LS-MIMO technique. 
By contrast, improved measures have to be considered for 
enhancing the QoS of the edge users, such as our SPR 
and MBD schemes, which will be discussed in detail in the 
following three subsections: 1) the proposed SPR scheme; 
2) channel estimation based on SPR; and 3) the MBD precoding
advocated.


A. Proposed Soft Pilot Reuse Scheme 

Inspired by the FFR scheme, which assigns orthogonal
frequency bands to edge users in adjacent cells to prevent
serious ICI in 3GPP LTE Release 9, we propose the SPR
scheme to mitigate the PC, whereby orthogonal pilot sub-groups
are assigned to the edge users in the adjacent cells,
while a center pilot group is reused for the center users of all
cells.

More specifically, consider a typical LS-MIMO system,
which is composed of $L$ hexagonal cells, where the $i$-th cell
supports $K_i$ users. In the conventional LS-MIMO scheme [?]-[?],
the number of orthogonal pilot sequences required can be
calculated as

$$K_{\mathrm{CS}} = \max\{K_i,\ i = 1, 2, \cdots, L\}. \tag{11}$$

In contrast to the conventional LS-MIMO scheme, where all
users are treated identically, the $K_i$ users of the $i$-th cell are
firstly divided into two groups according to their large-scale
fading coefficients $\{\beta_{i,i,k}^2\}$, which have cardinalities of

$$K_i = K_{i,c} + K_{i,e}, \tag{12}$$

where $K_{i,c} = \mathrm{card}\{k : \beta_{i,i,k}^2 > \rho_i\}$ denotes the number of
center users, while $K_{i,e} = \mathrm{card}\{k : \beta_{i,i,k}^2 \le \rho_i\}$ represents
the number of edge users. Thus, the number of orthogonal
pilot sequences needed in the proposed SPR scheme can be
calculated as

$$K_{\mathrm{SPR}} = K_c + K_e, \tag{13}$$

where $K_c = \max\{K_{i,c},\ i = 1, 2, \cdots, L\}$ denotes the number
of pilot sequences assigned to the center users, while
$K_e = \sum_{i=1}^{L} K_{i,e}$ denotes the number of pilot sequences dedicated
to the edge users. It should be pointed out that we assume
having $L$ cooperating cells, thus $L$ is a moderate value. For
example, we have $L = 7$ for the classic seven-cell system. Then
the entire set of pilot resources $\boldsymbol{\Phi}_{\mathrm{SPR}} \in \mathbb{C}^{K_{\mathrm{SPR}} \times \tau}$ associated
with $\boldsymbol{\Phi}_{\mathrm{SPR}} \boldsymbol{\Phi}_{\mathrm{SPR}}^H = \mathbf{I}_{K_{\mathrm{SPR}}}$ can be divided into

$$\boldsymbol{\Phi}_{\mathrm{SPR}} = [\boldsymbol{\Phi}_c^T\ \boldsymbol{\Phi}_e^T]^T, \tag{14}$$

where $\boldsymbol{\Phi}_c \in \mathbb{C}^{K_c \times \tau}$ is reused for the center users in all cells
and $\boldsymbol{\Phi}_e \in \mathbb{C}^{K_e \times \tau}$ is applied to the edge users of the adjacent
cells. Furthermore, $\boldsymbol{\Phi}_e$ can be divided into $L$ partitions, as

$$\boldsymbol{\Phi}_e = [\boldsymbol{\Phi}_{e,1}^T\ \boldsymbol{\Phi}_{e,2}^T\ \cdots\ \boldsymbol{\Phi}_{e,L}^T]^T, \tag{15}$$

where $\boldsymbol{\Phi}_{e,i} \in \mathbb{C}^{K_{i,e} \times \tau}$ is applied to the $K_{i,e}$ edge users in the
$i$-th cell. Thus, the pilot sequences applied to edge users are
orthogonal to those of the other users roaming in the adjacent
cells.

In the example of Fig. 4, there are 3 hexagonal cells
associated with $K_1 = 4$, $K_2 = 5$, and $K_3 = 6$ users. Completely
eliminating the PC would require 15 orthogonal
pilot sequences. It can be readily calculated that we have
$K_{\mathrm{CS}} = 6$ and $K_{\mathrm{SPR}} = 8$ for this simple case. Although the
proposed SPR scheme requires slightly more pilot resources
than the conventional scheme, the QoS of the edge users can be
significantly improved, which will be verified in the following
subsections.
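
The pilot counts of this example can be reproduced with a short sketch. The per-cell split into center and edge users is not stated in the text; the split used below (one edge user per cell) is our assumption, chosen because it is consistent with the quoted totals, and the function name `pilot_counts` is hypothetical.

```python
def pilot_counts(center_counts, edge_counts):
    """Return (K_CS, K_SPR, K_OS) for per-cell center and edge user counts."""
    totals = [c + e for c, e in zip(center_counts, edge_counts)]
    k_cs = max(totals)                              # conventional scheme, Eq. (11)
    k_spr = max(center_counts) + sum(edge_counts)   # soft pilot reuse, Eq. (13)
    k_os = sum(totals)                              # fully orthogonal pilots in all cells
    return k_cs, k_spr, k_os


if __name__ == "__main__":
    # Assumed split for K_1 = 4, K_2 = 5, K_3 = 6: one edge user per cell.
    print(pilot_counts(center_counts=[3, 4, 5], edge_counts=[1, 1, 1]))
```

With this assumed split, the sketch yields 6, 8 and 15 pilot sequences for the conventional, SPR and fully orthogonal schemes, respectively.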



Fig. 4. An example of the proposed SPR scheme. 

B. Channel Estimation Based on Soft Pilot Reuse 

By applying the proposed SPR scheme, the BS becomes 
capable of estimating the channels for its edge users in the 
absence of PC, since the pilot sequences assigned to the edge 
users are all orthogonal. Moreover, the BS can also obtain 
the partial knowledge of the inter-cell channels of the edge 
users in adjacent cells, which is the dominant source of the 
ICI inflicted upon these edge users of the adjacent cells during 
the BS’s DL transmissions. 

Specifically, we consider the same LS-MIMO system as
in Subsection V-A, which is composed of $L$ hexagonal cells,
where the $i$-th cell has $K_i$ users. Based on the proposed SPR
scheme, we divide the channel matrix $\mathbf{H}_{i,j}$ as defined in (2)
into two parts

$$\mathbf{H}_{i,j} = [\mathbf{H}_{i,j}^c\ \mathbf{H}_{i,j}^e], \tag{16}$$

where $\mathbf{H}_{i,j}^c \in \mathbb{C}^{M \times K_{j,c}}$ denotes the channel matrix of the link
spanning from the center users in the $j$-th cell to the BS in the
$i$-th cell, while $\mathbf{H}_{i,j}^e \in \mathbb{C}^{M \times K_{j,e}}$ denotes the channel matrix
of the link spanning from the edge users in the $j$-th cell to the
BS in the $i$-th cell. Then, the pilot sequence received at the
BS of the $i$-th cell can be represented by

$$\mathbf{Y}_i^p = \sqrt{p_p} \left( \sum_{j=1}^{L} \mathbf{H}_{i,j}^c \boldsymbol{\Phi}_c(r : K_{j,c}) + \sum_{j=1}^{L} \mathbf{H}_{i,j}^e \boldsymbol{\Phi}_{e,j} \right) + \mathbf{N}_i^p, \tag{17}$$

where $\boldsymbol{\Phi}_c(r : K_{j,c})$ denotes the sub-matrix comprised of the
first $K_{j,c}$ rows of $\boldsymbol{\Phi}_c$, while $\mathbf{N}_i^p$ denotes the corresponding
AWGN matrix at the UL receiver.

Then the BS becomes capable of estimating the channel of
its center users as

$$\hat{\mathbf{H}}_{i,i}^c = \frac{1}{\sqrt{p_p}} \mathbf{Y}_i^p [\boldsymbol{\Phi}_c(r : K_{i,c})]^H = \mathbf{H}_{i,i}^c + \sum_{j \neq i} \mathbf{H}_{i,j}^c(c : K_{i,c}) + \mathbf{N}_i^c, \tag{18}$$

where $\mathbf{N}_i^c = \frac{1}{\sqrt{p_p}} \mathbf{N}_i^p [\boldsymbol{\Phi}_c(r : K_{i,c})]^H$, which can be reduced to an
arbitrarily small value by increasing $M$, and $\mathbf{H}_{i,j}^c(c : K_{i,c})$
denotes the matrix comprised of the first $K_{i,c}$ columns of
$\mathbf{H}_{i,j}^c$. Note that if we have $K_{i,c} > K_{j,c}$, then $K_{i,c} - K_{j,c}$
zero vectors are used to fill $\mathbf{H}_{i,j}^c(c : K_{i,c})$. In contrast to the
conventional scheme of (4), the proposed scheme only allows
the center users to reuse the same pilot group of $\boldsymbol{\Phi}_c$, since the
PC imposed on these center users is modest. Consequently,
the severity of the PC inflicted upon the channel estimation of
the center users given in (18) is minor.

On the other hand, by adopting the proposed SPR scheme,
the BS of the $i$-th cell becomes capable of acquiring the
channel estimate of its edge users without excessive PC,
yielding

$$\hat{\mathbf{H}}_{i,i}^e = \frac{1}{\sqrt{p_p}} \mathbf{Y}_i^p \boldsymbol{\Phi}_{e,i}^H = \mathbf{H}_{i,i}^e + \mathbf{N}_i^e, \tag{19}$$

where we have $\mathbf{N}_i^e = \frac{1}{\sqrt{p_p}} \mathbf{N}_i^p \boldsymbol{\Phi}_{e,i}^H$, which can be made
arbitrarily small by increasing the number of antennas at the
BS. It is clear that the PC is completely eliminated for these
edge users and, therefore, the channel estimation accuracy of
these edge users is significantly enhanced. By contrast, with
the conventional scheme, these edge users suffer from grave
PC, and hence their channel estimates have extremely poor
quality, which severely limits the achievable UL detection
performance. With the aid of the proposed SPR scheme, the
full channel estimate at the BS of the $i$-th cell is then given
by

$$\hat{\mathbf{H}}_{i,i} = [\hat{\mathbf{H}}_{i,i}^c\ \hat{\mathbf{H}}_{i,i}^e], \tag{20}$$

which is significantly more accurate than that of the conventional
channel estimation scheme of (4). Thus, given this more
accurate channel estimate, the UL achievable rate of the edge
users can be significantly increased, which will be analyzed
in detail in Section VI.
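
The effect of the SPR pilot structure on the channel estimates can be illustrated with a toy noise-free example. The sketch below is ours, not the authors': it assumes unit pilot power, unit large-scale fading, no noise, and the rows of a unitary DFT matrix as orthonormal pilots; `spr_estimates` and its default parameters are hypothetical.

```python
import numpy as np


def spr_estimates(M=8, center_counts=(2, 1), edge_counts=(1, 1), seed=0):
    """Noise-free SPR channel estimation at the BS of cell 0 (toy example)."""
    rng = np.random.default_rng(seed)
    L = len(center_counts)
    k_c, k_e = max(center_counts), sum(edge_counts)
    tau = k_c + k_e
    # Rows of the unitary DFT matrix are orthonormal: phi @ phi^H = I.
    phi = np.fft.fft(np.eye(tau)) / np.sqrt(tau)
    phi_c, phi_e = phi[:k_c], phi[k_c:]
    starts = np.concatenate(([0], np.cumsum(edge_counts)))
    phi_e_parts = [phi_e[starts[j]:starts[j + 1]] for j in range(L)]

    def rand(n):
        return (rng.standard_normal((M, n)) + 1j * rng.standard_normal((M, n))) / np.sqrt(2.0)

    h_c = [rand(n) for n in center_counts]   # channels of center users of cell j to BS 0
    h_e = [rand(n) for n in edge_counts]     # channels of edge users of cell j to BS 0
    # Received pilot matrix: center pilots are reused, edge pilots are cell-specific.
    y = sum(h_c[j] @ phi_c[:center_counts[j]] + h_e[j] @ phi_e_parts[j] for j in range(L))
    h_c_hat = y @ phi_c[:center_counts[0]].conj().T   # still contaminated by other centers
    h_e_hat = y @ phi_e_parts[0].conj().T             # free of pilot contamination
    return h_c, h_e, h_c_hat, h_e_hat
```

In this setting the edge estimate coincides with the true edge channel, whereas the center estimate contains the sum of the center channels that share the same pilot rows, zero-padded where a neighbouring cell has fewer center users.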

Moreover, since the edge users of the adjacent cells rely on
orthogonal pilot sequences, a BS can also acquire the partial
inter-cell channels for the edge users of the adjacent cells.
Specifically, by correlating the received pilot matrix $\mathbf{Y}_i^p$ with
$\boldsymbol{\Phi}_{e,j}$, the BS of the $i$-th cell becomes capable of acquiring the
partial inter-cell channels from the edge users in the $j$-th cell
without PC, as follows:

$$\hat{\mathbf{H}}_{i,j}^e = \frac{1}{\sqrt{p_p}} \mathbf{Y}_i^p \boldsymbol{\Phi}_{e,j}^H = \mathbf{H}_{i,j}^e + \mathbf{N}_{i,j}^e, \tag{21}$$

where $\mathbf{N}_{i,j}^e = \frac{1}{\sqrt{p_p}} \mathbf{N}_i^p \boldsymbol{\Phi}_{e,j}^H$ can be rendered arbitrarily small
upon increasing $M$. Thus, the BS of the $i$-th cell becomes
capable of accurately estimating all the partial inter-cell channels
of the links spanning from the edge users of the adjacent
cells, which comprises an estimate of the inter-cell channel
matrix $\mathbf{A}_i \in \mathbb{C}^{(K_e - K_{i,e}) \times M}$ as

$$\hat{\mathbf{A}}_i = [\hat{\mathbf{H}}_{i,1}^e\ \cdots\ \hat{\mathbf{H}}_{i,i-1}^e\ \hat{\mathbf{H}}_{i,i+1}^e\ \cdots\ \hat{\mathbf{H}}_{i,L}^e]^T. \tag{22}$$


For instance, in the simple example depicted in Fig. 4, the BS
in the 1st cell is able to acquire the accurate channel estimates
of both its edge user as well as of the partial inter-cell channels
of the edge users in two adjacent cells. The inter-cell channel
matrix $\mathbf{A}_i$ provides important information for the DL transmit
precoding design. Armed with its accurate estimate $\hat{\mathbf{A}}_i$, the
BS of the $i$-th cell will be able to beneficially preprocess its
transmissions for the sake of reducing the ICI inflicted upon its
neighbouring edge users roaming in the adjacent cells, which
is the topic of the next subsection.


C. Multi-Cell Block Diagonalization Precoding

By selecting the TPC vector for a specific user from the
null space of the channels of other users, the classical
BD TPC scheme [?] adopted in single-cell multi-user MIMO
systems is capable of eliminating the multi-user interference.
Armed with the estimate of the partial inter-cell channels, we
propose the MBD TPC by extending the classical BD TPC
to a multi-cell multi-user scenario. Specifically, by projecting
the DL transmit signal onto the null space of the subspace
spanned by the inter-cell channels, the proposed MBD TPC
becomes capable of eliminating the ICI imposed on these edge
users.

In order to obtain the null space of the inter-cell channels,
we first apply the classic SVD [22] to the inter-cell channel
matrix $\mathbf{A}_i$, yielding

$$\mathbf{A}_i = \mathbf{U}_i \boldsymbol{\Sigma}_i \mathbf{V}_i^H, \tag{23}$$

where $\mathbf{U}_i \in \mathbb{C}^{(K_e - K_{i,e}) \times (K_e - K_{i,e})}$ denotes the left-singular-vector
matrix, $\mathbf{V}_i \in \mathbb{C}^{M \times M}$ denotes the right-singular-vector
matrix, and $\boldsymbol{\Sigma}_i \in \mathbb{C}^{(K_e - K_{i,e}) \times M}$ is comprised of the singular
values as

$$\boldsymbol{\Sigma}_i = \begin{bmatrix} \mathbf{S}_i & \mathbf{0}_{r_i \times (M - r_i)} \\ \mathbf{0}_{(K_e - K_{i,e} - r_i) \times r_i} & \mathbf{0}_{(K_e - K_{i,e} - r_i) \times (M - r_i)} \end{bmatrix}, \tag{24}$$

in which $r_i = \mathrm{rank}(\mathbf{A}_i)$ is the rank of $\mathbf{A}_i$,
$\mathbf{S}_i = \mathrm{diag}\{\sigma_{i,1}, \sigma_{i,2}, \cdots, \sigma_{i,r_i}\}$ and the singular values satisfy

$$\sigma_{i,1} \ge \sigma_{i,2} \ge \cdots \ge \sigma_{i,r_i} > 0. \tag{25}$$

According to the properties of the full SVD, the null space of the
inter-cell channels, namely $\mathrm{Null}(\mathbf{A}_i) \subset \mathbb{C}^M$, can be spanned
by the columns of the matrix $\mathbf{B}_i \in \mathbb{C}^{M \times (M - r_i)}$, which is a
sub-matrix of $\mathbf{V}_i$ defined by

$$\mathbf{B}_i = [\mathbf{v}_{i,r_i+1}\ \mathbf{v}_{i,r_i+2}\ \cdots\ \mathbf{v}_{i,M}], \tag{26}$$

where $\mathbf{v}_{i,j}$ denotes the $j$-th column of $\mathbf{V}_i$. Note that the
existence of this null space is guaranteed owing to the fact
that the number of antennas at the BS of LS-MIMO systems
is much larger than that of the edge users, i.e., we have

$$M \gg K_e > K_e - K_{i,e} \ge r_i. \tag{27}$$

The large null space of the inter-cell channels indicates that
for any transmit precoding matrix chosen from this null space,
i.e., for $\forall\, \mathbf{W}_i \subset \mathrm{Null}(\mathbf{A}_i)$, we have

$$\mathbf{A}_i \mathbf{W}_i = \mathbf{0} \Rightarrow (\mathbf{H}_{i,j}^e)^T \mathbf{W}_i = \mathbf{0}, \quad \forall j \neq i, \tag{28}$$

which means that this transmit precoding matrix calculated
for the $i$-th cell is capable of eliminating the ICI inflicted
upon the edge users roaming in the adjacent cells. It is
plausible, however, that a precoding matrix which is randomly
chosen from the null space $\mathrm{Null}(\mathbf{A}_i)$ may cause severe
intra-cell interference. To avoid the deleterious effects of
intra-cell interference, we project a conventional transmit precoding
matrix onto this null space.
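
The null-space construction of (23)-(26) and the associated projection can be sketched as follows. This is a minimal illustration under our own assumptions: NumPy's SVD is used, the rank is decided by a relative tolerance that we chose, and `null_space_projector` is a hypothetical helper name.

```python
import numpy as np


def null_space_projector(a, tol=1e-10):
    """Return (B, P): an orthonormal basis of Null(a) and the projector P = B B^H."""
    # np.linalg.svd returns V^H; its rows are the conjugated right-singular vectors.
    _, s, vh = np.linalg.svd(a, full_matrices=True)
    rank = int(np.sum(s > tol * s[0]))
    b = vh[rank:].conj().T          # columns v_{r+1}, ..., v_M of V
    return b, b @ b.conj().T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((3, 16)) + 1j * rng.standard_normal((3, 16))
    b, p = null_space_projector(a)
    w = p @ (rng.standard_normal((16, 4)) + 1j * rng.standard_normal((16, 4)))
    print(b.shape, np.abs(a @ w).max())
```

Any matrix multiplied from the left by the projector lies in the null space, so the product with the inter-cell channel matrix vanishes up to numerical precision.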
Fig. 5. An illustrative example of the MBD precoding scheme (figure labels: smart multi-cell precoding; edge users in adjacent cells).


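For example, by projecting the conventional MF precoding
matrix $\mathbf{W}_i^{\mathrm{MF}} = \frac{1}{\sqrt{\gamma_i}} \hat{\mathbf{H}}_{i,i}^*$ onto the null space $\mathrm{Null}(\mathbf{A}_i)$, we
can generate the MF based MBD matrix as

$$\mathbf{W}_i^{\mathrm{MFMBD}} = \frac{1}{\sqrt{\gamma_i^{\mathrm{MFMBD}}}} \mathbf{P}_{B_i} \hat{\mathbf{H}}_{i,i}^*, \tag{29}$$

where $\mathbf{P}_{B_i} = \mathbf{B}_i \mathbf{B}_i^H$ denotes the projection operator based on
the matrix $\mathbf{B}_i$, and $\gamma_i^{\mathrm{MFMBD}}$ is a normalization factor given by

$$\gamma_i^{\mathrm{MFMBD}} = \frac{1}{K_i} \mathrm{Tr}\big(\hat{\mathbf{H}}_{i,i}^T \mathbf{P}_{B_i}^H \mathbf{P}_{B_i} \hat{\mathbf{H}}_{i,i}^*\big) = \frac{1}{K_i} \mathrm{Tr}\big(\hat{\mathbf{H}}_{i,i}^T \mathbf{P}_{B_i} \hat{\mathbf{H}}_{i,i}^*\big), \tag{30}$$

in which $\mathbf{P}_{B_i}^H = \mathbf{P}_{B_i}$ and $\mathbf{P}_{B_i} \mathbf{P}_{B_i} = \mathbf{P}_{B_i}$ are applied.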

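Similarly, by projecting the conventional ZF transmit precoding
matrix onto the null space $\mathrm{Null}(\mathbf{A}_i)$, we can generate
the ZF based MBD matrix as

$$\begin{aligned}
\mathbf{W}_i^{\mathrm{ZFMBD}} &= \frac{1}{\sqrt{\gamma_i^{\mathrm{ZFMBD}}}}\left(\left(\mathbf{P}_{\mathbf{B}_i}^{T}\hat{\mathbf{H}}_{i,i}\right)^{T}\right)^{\dagger} \\
&= \frac{1}{\sqrt{\gamma_i^{\mathrm{ZFMBD}}}}\,\mathbf{P}_{\mathbf{B}_i}\hat{\mathbf{H}}_{i,i}^{*}\left(\hat{\mathbf{H}}_{i,i}^{T}\mathbf{P}_{\mathbf{B}_i}^{H}\mathbf{P}_{\mathbf{B}_i}\hat{\mathbf{H}}_{i,i}^{*}\right)^{-1} \\
&= \frac{1}{\sqrt{\gamma_i^{\mathrm{ZFMBD}}}}\,\mathbf{P}_{\mathbf{B}_i}\hat{\mathbf{H}}_{i,i}^{*}\left(\hat{\mathbf{H}}_{i,i}^{T}\mathbf{P}_{\mathbf{B}_i}\hat{\mathbf{H}}_{i,i}^{*}\right)^{-1}, \qquad (31)
\end{aligned}$$

where the normalization factor $\gamma_i^{\mathrm{ZFMBD}}$ is calculated as

$$\gamma_i^{\mathrm{ZFMBD}} = \frac{1}{K_i}\mathrm{Tr}\!\left(\left(\hat{\mathbf{H}}_{i,i}^{T}\mathbf{P}_{\mathbf{B}_i}^{H}\mathbf{P}_{\mathbf{B}_i}\hat{\mathbf{H}}_{i,i}^{*}\right)^{-1}\right) = \frac{1}{K_i}\mathrm{Tr}\!\left(\left(\hat{\mathbf{H}}_{i,i}^{T}\mathbf{P}_{\mathbf{B}_i}\hat{\mathbf{H}}_{i,i}^{*}\right)^{-1}\right). \qquad (32)$$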


Both the precoding matrices $\mathbf{W}_i^{\mathrm{MFMBD}}$ and $\mathbf{W}_i^{\mathrm{ZFMBD}}$ are
capable of eliminating the ICI imposed on the edge users
of the adjacent cells. An illustrative example is depicted in
Fig. 5, where, based on the proposed SPR scheme, the BS
becomes capable of estimating the partial inter-cell channels 
of the edge users roaming in the adjacent cells. The MBD TPC 
then projects the DL transmission signal onto the null space 
of the partial inter-cell channels in order to eliminate the ICI 
contaminating the reception of these edge users in the adjacent 
cells. Therefore, the MBD TPC significantly increases the DL 
achievable rate of edge users, and consequently the QoS of 
edge users is considerably enhanced. 
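
The projection step described above can be illustrated numerically. The following is a minimal sketch rather than the authors' implementation: it assumes perfect channel knowledge (the true channels stand in for their estimates), stacks the inter-cell edge-user channels as the columns of `A`, and normalizes by the number of served users. The function names `null_space_basis`, `mf_mbd` and `zf_mbd` are hypothetical.

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis B (M x (M - rank)) satisfying A.T @ B = 0.

    A is M x K_adj; its columns are the inter-cell channels of the
    edge users in the adjacent cells (assumed layout).
    """
    # Full SVD of A^T: right singular vectors beyond the rank span the null space.
    _, s, Vh = np.linalg.svd(A.T, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    return Vh[rank:].conj().T

def mf_mbd(H_own, B):
    """MF precoder projected onto the null space (assumed 1/K normalization)."""
    P = B @ B.conj().T                      # projection operator B B^H
    W = P @ H_own.conj()
    gamma = np.real(np.trace(W.conj().T @ W)) / H_own.shape[1]
    return W / np.sqrt(gamma)

def zf_mbd(H_own, B):
    """ZF precoder projected onto the null space (assumed 1/K normalization)."""
    P = B @ B.conj().T
    G = H_own.T @ P @ H_own.conj()          # K x K Gram matrix after projection
    W = P @ H_own.conj() @ np.linalg.inv(G)
    gamma = np.real(np.trace(W.conj().T @ W)) / H_own.shape[1]
    return W / np.sqrt(gamma)
```

For any full-rank `A` with fewer columns than BS antennas, `A.T @ mf_mbd(H, B)` and `A.T @ zf_mbd(H, B)` vanish up to numerical precision, which is the ICI-elimination property of (28); the ZF variant additionally diagonalizes `H.T @ W`, removing the intra-cell interference.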


VI. Performance Analysis 

To characterize the performance of the proposed SPR and MBD
schemes, we first discuss the amount of pilot resources
required, then derive both the UL as well as the DL achievable
rate, and finally consider the computational complexity imposed.

A. Pilot Resource Consumption 

As seen earlier, the conventional scheme requires $K_{\mathrm{CS}} =
\max\{K_i, 1 \le i \le L\}$ orthogonal pilot sequences
and it suffers from grave PC. Again, in order to eliminate the
PC caused by the reuse of the same pilot group in adjacent
cells, the most plausible solution is to apply orthogonal pilot
sequences to all users in all cells. However, the number of
orthogonal pilot sequences would be increased to

$$K_{\mathrm{OS}} = \sum_{i=1}^{L} K_i, \qquad (33)$$

which leads to a substantial spectral efficiency reduction.

Recall that the proposed SPR and MBD schemes are capable
of enhancing the QoS for edge users at the expense of a
slightly increased number of pilot resources. More specifically,
by comparing the pilot requirements of the two schemes, the
additional pilot resources required by the proposed SPR scheme
can be derived as

$$K_{\mathrm{SPR}} - K_{\mathrm{CS}} = \sum_{i \neq i_0} K_{i,e} < \sum_{i=1}^{L} K_{i,e}, \qquad (34)$$

where $i_0$ denotes the index of the cell which has the most
users, i.e., $K_{i_0} = K_{\mathrm{CS}}$. It is clear that the additional number of
pilot sequences is close to the total number of the edge users.
Since the edge users are classified according to the threshold
$\rho_i$, which can be adjusted by the parameter $\lambda$, the number of
edge users can be flexibly adjusted too.

More explicitly, the careful choice of the parameter $\lambda$ provides
a flexible trade-off between the pilot resources required
and the achievable system performance of the proposed SPR
scheme. At one extreme, when the parameter $\lambda$ is set to a
sufficiently large value, e.g., $\lambda = K_{\mathrm{CS}}$, all the users will be
regarded as edge users and the proposed SPR scheme becomes
equivalent to the orthogonal scheme, where all users in all
cells use orthogonal pilot sequences. Naturally, this achieves
the best performance but relies on the most pilot resources,
requiring $K_{\mathrm{OS}}$ orthogonal pilot sequences. At the other extreme,
when the parameter $\lambda$ is set to 0, all users are regarded as
center users and the proposed SPR scheme degrades to the
conventional scheme that reuses the same pilot group in all
cells. The resultant arrangement attains the worst performance
but consumes the minimum pilot resources, requiring only
$K_{\mathrm{CS}}$ orthogonal pilot sequences.
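
As a small worked illustration of (33) and (34), the pilot budgets of the three schemes can be tabulated for a given user split. This is a sketch under stated assumptions: the per-cell user counts below are hypothetical, the function name `pilot_counts` is not from the paper, and the SPR count is obtained by adding the extra pilots of (34) to the conventional count.

```python
def pilot_counts(users_per_cell, edge_users_per_cell):
    """Return (K_CS, K_OS, K_SPR) for the conventional, orthogonal and SPR schemes.

    users_per_cell[i]      : total users K_i in cell i
    edge_users_per_cell[i] : edge users K_{i,e} in cell i
    """
    k_cs = max(users_per_cell)                      # conventional: one reused group
    k_os = sum(users_per_cell)                      # fully orthogonal, Eq. (33)
    i0 = users_per_cell.index(k_cs)                 # cell with the most users
    extra = sum(e for i, e in enumerate(edge_users_per_cell) if i != i0)
    return k_cs, k_os, k_cs + extra                 # SPR via Eq. (34)

# Hypothetical three-cell example.
print(pilot_counts([10, 8, 9], [2, 1, 3]))
```

With no edge users the SPR count collapses to the conventional one, and when every user is an edge user it equals the fully orthogonal count, matching the two extremes discussed above.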

B. Uplink Transmission 

For the center users, the average SINR performance of
SPR-aided UL transmission becomes almost the same as that of
applying the conventional scheme. This is because for the
center users in a cell, the estimated channel matrix
obtained by applying the conventional scheme is very similar
to that of (18) obtained by applying the SPR scheme. However,
the achievable rate of the center users of the SPR-aided
UL transmission is slightly reduced, since the pilot overhead
increases. On the other hand, the performance analysis
of UL transmission for the edge users is more involved,
as seen below.

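Similar to the received signal given earlier for
the conventional scheme, the received signal at the BS of the
i-th cell based on the SPR scheme can be represented as

$$\mathbf{y}_i^{u} = \sqrt{\rho_u}\sum_{j=1}^{L}\left(\mathbf{H}^{c}_{i,j}\mathbf{x}^{u,c}_{j} + \mathbf{H}^{e}_{i,j}\mathbf{x}^{u,e}_{j}\right) + \mathbf{n}_i^{u}, \qquad (35)$$

where $\mathbf{x}^{u,c}_{j} = [x^{u,c}_{j,1}\; x^{u,c}_{j,2}\; \cdots\; x^{u,c}_{j,K_{j,c}}]^{T}$ denotes the symbol
vector transmitted from the $K_{j,c}$ center users in the j-th cell,
$\mathbf{x}^{u,e}_{j} = [x^{u,e}_{j,1}\; x^{u,e}_{j,2}\; \cdots\; x^{u,e}_{j,K_{j,e}}]^{T}$ is the symbol vector transmitted
from the $K_{j,e}$ edge users in the j-th cell, and $\mathbf{n}_i^{u}$ denotes the
corresponding UL AWGN vector.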

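By adopting the MF detector based on the channel estimation
obtained by the SPR scheme for the edge users in the i-th
cell, i.e., $\hat{\mathbf{H}}^{e}_{i,i}$ of (19), the detected symbol vector for the $K_{i,e}$
users in the i-th cell is given by

$$\begin{aligned}
\hat{\mathbf{x}}^{u,e}_{i} &= (\hat{\mathbf{H}}^{e}_{i,i})^{H}\mathbf{y}_i^{u} \\
&= (\hat{\mathbf{H}}^{e}_{i,i})^{H}\left[\sqrt{\rho_u}\sum_{j=1}^{L}\left(\mathbf{H}^{c}_{i,j}\mathbf{x}^{u,c}_{j} + \mathbf{H}^{e}_{i,j}\mathbf{x}^{u,e}_{j}\right) + \mathbf{n}_i^{u}\right] \\
&= M\sqrt{\rho_u}\,\mathbf{D}^{e}_{i,i}\mathbf{x}^{u,e}_{i} + \boldsymbol{\eta}^{u,e}_{i} \approx M\sqrt{\rho_u}\,\mathbf{D}^{e}_{i,i}\mathbf{x}^{u,e}_{i}, \qquad (36)
\end{aligned}$$

where $\mathbf{D}^{e}_{i,i} = \mathrm{diag}\{\beta^{e}_{i,i,1}, \beta^{e}_{i,i,2}, \cdots, \beta^{e}_{i,i,K_{i,e}}\}$ denotes
the sub-diagonal matrix of $\mathbf{D}_{i,i}$ consisting of the $K_{i,e}$
edge users' large-scale fading coefficients, and $\boldsymbol{\eta}^{u,e}_{i} =
[\eta^{u,e}_{i,1}\; \eta^{u,e}_{i,2}\; \cdots\; \eta^{u,e}_{i,K_{i,e}}]^{T}$ denotes the interference which can be
made arbitrarily small by increasing the number of antennas
at the BS. In particular, for the k-th edge user in the i-th cell,
the detected symbol is given by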


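$$\hat{x}^{u,e}_{i,k} \approx \sqrt{\rho_u}\,(\mathbf{h}^{e}_{i,i,k})^{H}\mathbf{h}^{e}_{i,i,k}\,x^{u,e}_{i,k} + \mu^{u,e}_{i,k} + \bar{n}^{u,e}_{i,k}, \qquad (37)$$

where $\mu^{u,e}_{i,k} = \sqrt{\rho_u}\sum_{k' \neq k}(\mathbf{h}^{e}_{i,i,k})^{H}\mathbf{h}^{e}_{i,i,k'}\,x^{u,e}_{i,k'}$ is the intra-cell
interference arriving from the other edge users in the same
cell, which can be rendered arbitrarily small by increasing M,
and $\bar{n}^{u,e}_{i,k}$ denotes the residual noise and interference term.
Similar to the derivation for the conventional scheme, the UL
SINR of the k-th edge user in the i-th cell can be calculated as

$$\mathrm{SINR}^{u,e}_{i,k} = \rho_u\,\frac{\left|(\mathbf{h}^{e}_{i,i,k})^{H}\mathbf{h}^{e}_{i,i,k}\right|^{2}}{\left|\mu^{u,e}_{i,k}\right|^{2} + \left|\bar{n}^{u,e}_{i,k}\right|^{2}}, \qquad (38)$$

and the achievable UL rate can be calculated as $C^{u,e}_{i,k} = \left(1 -
\frac{K_{\mathrm{SPR}}}{K_{\mathrm{CS}}}\mu_0\right)E\{\log_2(1 + \mathrm{SINR}^{u,e}_{i,k})\}$. Note that unlike the result of
the conventional scheme, $\mathrm{SINR}^{u,e}_{i,k}$ increases as M
increases, and in the asymptotic case of $M \to \infty$, we have
$\mathrm{SINR}^{u,e}_{i,k} \to \infty$.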

In summary, in contrast to the conventional scheme, which is
unable to remove the PC by simply increasing the number of
antennas M at the BS, the proposed scheme eliminates the PC
imposed on the UL data transmission of the edge users, and
consequently their UL achievable rate is significantly improved.
Similar results can be obtained if we adopt the ZF detector for
UL transmission in our proposal, and the ZF detector outperforms
the MF detector, as will be verified by our numerical results.
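
The growth of the edge users' UL SINR with M can be checked with a short Monte-Carlo sketch. It is an illustration only, under assumptions that are not taken from the paper: the channel of the user of interest is known perfectly (mimicking the PC-free estimate delivered by SPR), the small-scale fading is i.i.d. Rayleigh, only intra-cell interference and unit-variance noise are modelled, and the large-scale coefficients in `betas` are hypothetical.

```python
import numpy as np

def median_ul_sinr(M, betas, rho_u=1.0, trials=2000, seed=0):
    """Median MF-detector SINR of user 0 for a BS with M antennas."""
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=float)
    K = betas.size
    sinr = np.empty(trials)
    for t in range(trials):
        # i.i.d. CN(0, 1) small-scale fading, scaled by the large-scale gains.
        G = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
        H = G * np.sqrt(betas)              # column k is the channel of user k
        n = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
        h0 = H[:, 0]
        signal = rho_u * np.abs(h0.conj() @ h0) ** 2
        interference = rho_u * np.sum(np.abs(h0.conj() @ H[:, 1:]) ** 2)
        noise = np.abs(h0.conj() @ n) ** 2
        sinr[t] = signal / (interference + noise)
    return float(np.median(sinr))

for M in (32, 128, 512):
    print(M, median_ul_sinr(M, [1.0, 0.8, 0.6, 0.5]))
```

Because the desired term scales with the square of M while the interference and noise terms scale only linearly, the printed median SINR keeps increasing with M, in line with the asymptotic behaviour stated for the edge users.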


C. Downlink Transmission 


Similar to the analysis of the UL transmission, in this 
section we focus our attention on the DL transmission of the 
edge users in our proposal. 

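By adopting the MFMBD precoding matrix as derived in
(29), the received signal vector of the $K_{i,e}$ edge users in the
i-th cell can be represented as

$$\begin{aligned}
\mathbf{y}^{d,e}_{i} &= \sqrt{\rho_d}\sum_{j=1}^{L}(\mathbf{H}^{e}_{j,i})^{T}\mathbf{W}_j^{\mathrm{MFMBD}}\begin{bmatrix}\mathbf{x}^{d,c}_{j}\\ \mathbf{x}^{d,e}_{j}\end{bmatrix} + \mathbf{n}^{d,e}_{i} \\
&= \sqrt{\rho_d}\sum_{j=1}^{L}\frac{1}{\sqrt{\gamma_j^{\mathrm{MFMBD}}}}(\mathbf{H}^{e}_{j,i})^{T}\mathbf{B}_j\mathbf{B}_j^{H}\hat{\mathbf{H}}^{*}_{j,j}\begin{bmatrix}\mathbf{x}^{d,c}_{j}\\ \mathbf{x}^{d,e}_{j}\end{bmatrix} + \mathbf{n}^{d,e}_{i} \\
&\approx \sqrt{\frac{\rho_d}{\gamma_i^{\mathrm{MFMBD}}}}\,(\mathbf{H}^{e}_{i,i})^{T}\mathbf{B}_i\mathbf{B}_i^{H}\mathbf{H}^{*}_{i,i}\begin{bmatrix}\mathbf{x}^{d,c}_{i}\\ \mathbf{x}^{d,e}_{i}\end{bmatrix} + \mathbf{n}^{d,e}_{i}, \qquad (39)
\end{aligned}$$

where $\mathbf{x}^{d,c}_{j} = [x^{d,c}_{j,1}\; x^{d,c}_{j,2}\; \cdots\; x^{d,c}_{j,K_{j,c}}]^{T}$ are the symbols
transmitted to the $K_{j,c}$ center users in the j-th cell, $\mathbf{x}^{d,e}_{j} =
[x^{d,e}_{j,1}\; x^{d,e}_{j,2}\; \cdots\; x^{d,e}_{j,K_{j,e}}]^{T}$ are the symbols destined for the $K_{j,e}$
edge users in the j-th cell, $\mathbf{n}^{d,e}_{i} = [n^{d,e}_{i,1}\; n^{d,e}_{i,2}\; \cdots\; n^{d,e}_{i,K_{i,e}}]^{T}$
denotes the corresponding DL AWGN vector, and the approximation
$\approx$ holds as we apply $\hat{\mathbf{H}}_{i,i} \approx \mathbf{H}_{i,i}$ and $(\mathbf{H}^{e}_{j,i})^{T}\mathbf{B}_j = \mathbf{0}$
for $j \neq i$ (see (28)).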

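Thus, for the k-th edge user in the i-th cell, the received
symbol can be represented as

$$y^{d,e}_{i,k} \approx \sqrt{\frac{\rho_d}{\gamma_i^{\mathrm{MFMBD}}}}\,(\mathbf{h}^{e}_{i,i,k})^{T}\mathbf{B}_i\mathbf{B}_i^{H}(\mathbf{h}^{e}_{i,i,k})^{*}\,x^{d,e}_{i,k} + \mu^{d,e}_{i,k} + \bar{n}^{d,e}_{i,k}, \qquad (40)$$

where the intra-cell interference $\mu^{d,e}_{i,k}$ is given in (41). Both
$\bar{n}^{d,e}_{i,k}$ and $\mu^{d,e}_{i,k}$ can be made arbitrarily small by increasing the
number of antennas at the BS. The DL SINR of the k-th edge
user in the i-th cell can then be calculated as

$$\mathrm{SINR}^{d,e}_{i,k} = \frac{\rho_d}{\gamma_i^{\mathrm{MFMBD}}}\,\frac{\left|(\mathbf{h}^{e}_{i,i,k})^{T}\mathbf{B}_i\mathbf{B}_i^{H}(\mathbf{h}^{e}_{i,i,k})^{*}\right|^{2}}{\left|\mu^{d,e}_{i,k}\right|^{2} + \left|\bar{n}^{d,e}_{i,k}\right|^{2}}, \qquad (42)$$

and the achievable DL rate can be calculated as $C^{d,e}_{i,k} = \left(1 -
\frac{K_{\mathrm{SPR}}}{K_{\mathrm{CS}}}\mu_0\right)E\{\log_2(1 + \mathrm{SINR}^{d,e}_{i,k})\}$. In contrast to the result of the
conventional scheme, $\mathrm{SINR}^{d,e}_{i,k}$ increases as M
increases, and in the asymptotic case of $M \to \infty$ we
have $\mathrm{SINR}^{d,e}_{i,k} \to \infty$.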

It is clear that for the edge users the PC is eliminated by
our proposed scheme during the DL data transmission, and
additionally, both the ICI as well as the intra-cell interference
imposed on these edge users have been reduced by the MBD
scheme. Similar results can be obtained if we adopt the ZF
based MBD TPC matrix, i.e., $\mathbf{W}_i^{\mathrm{ZFMBD}}$, for DL transmission
in our proposal.

Taking into account the extra pilot resource, the achievable 
DL rate of the edge users has been significantly improved, 
while that of the center users is slightly reduced. Moreover, 
the average UL and DL cell throughput of the SPR and MBD 
assisted system will be confirmed later by our simulation study. 
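$$\mu^{d,e}_{i,k} = \sqrt{\frac{\rho_d}{\gamma_i^{\mathrm{MFMBD}}}}\left[\sum_{k'=1}^{K_{i,c}}(\mathbf{h}^{e}_{i,i,k})^{T}\mathbf{B}_i\mathbf{B}_i^{H}(\mathbf{h}^{c}_{i,i,k'})^{*}\,x^{d,c}_{i,k'} + \sum_{k' \neq k}(\mathbf{h}^{e}_{i,i,k})^{T}\mathbf{B}_i\mathbf{B}_i^{H}(\mathbf{h}^{e}_{i,i,k'})^{*}\,x^{d,e}_{i,k'}\right] \qquad (41)$$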
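
The trade-off between the higher edge-user SINR and the larger pilot overhead can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the pilot overhead of the SPR scheme scales with the number of pilot sequences, i.e., $\mu_0 K_{\mathrm{SPR}}/K_{\mathrm{CS}}$, it replaces the expectation over the channel by a single SINR value, and all SINR figures as well as the function names are hypothetical.

```python
import math

def spr_pilot_overhead(mu0, k_spr, k_cs):
    """Pilot overhead of SPR, assumed to scale with the pilot count."""
    return mu0 * k_spr / k_cs

def achievable_rate(sinr_linear, pilot_overhead):
    """Rate in bps/Hz for one SINR realization (no averaging over fading)."""
    return (1.0 - pilot_overhead) * math.log2(1.0 + sinr_linear)

mu0, k_cs, k_spr = 0.1, 10, 14          # hypothetical pilot budget
mu_spr = spr_pilot_overhead(mu0, k_spr, k_cs)

# Center user: same SINR in both schemes, so only the overhead differs.
print(achievable_rate(15.0, mu0), achievable_rate(15.0, mu_spr))
# Edge user: hypothetical SINR improvement from 1 to 7 (linear scale).
print(achievable_rate(1.0, mu0), achievable_rate(7.0, mu_spr))
```

In this toy setting the center user loses a small fraction of its rate to the extra pilots, whereas the edge user gains far more from the improved SINR than it loses to the overhead, which mirrors the qualitative conclusion drawn above.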


TABLE I
Basic Simulation Parameters

Parameter                                                  Value
Number of cells $L_{\mathrm{total}}$                                   19
Number of antennas in BS $M$                               $32 \le M \le 256$
Number of users in the i-th cell $K_i$                     $8 \le K_i \le 10$
Number of pilot resource $K_{\mathrm{CS}}$                             $K_{\mathrm{CS}} = 10$
Threshold $\rho_i$ adjustment parameter $\lambda$                    $0.05 \le \lambda \le 1$
Cell radius $R$                                            500 m
Average transmit power at users $\rho_p$, $\rho_u$                 10 dBm
Average transmit power at BS $\rho_d$                         12 dBm
Path loss exponent $\alpha$                                    3
Log normal shadowing fading $\sigma_{\mathrm{shadow}}$                     8 dB
Carrier frequency                                          2 GHz
System bandwidth                                           10 MHz
Thermal noise density                                      -174 dBm/Hz
Pilot overhead parameter $\mu_0$                              $\mu_0 = 0.1$
Minimum distance between user and BS                       30 m


D. Computational Complexity 

The computational complexity of implementing the MBD
scheme at the BS for the edge users will be quantified in terms
of the number of complex-valued multiplications required,
which includes the following two main contributions:

1) For the SVD operator, the complexity is on the order of
$\mathcal{O}(MK_{\mathrm{ES}}^{2})$, which allows us to calculate the SVD of
$\mathbf{A}_i$ by using the QR decomposition.

2) For the matrix pseudo inverse operation, the complexity
is on the order of $\mathcal{O}(MK_{\mathrm{CS}}^{2})$, which allows us to
generate the ZF based MBD precoding matrix by using
the Gram-Schmidt algorithm.

The total computational complexity of implementing the
MBD scheme at the BS is therefore on the order of
$\mathcal{O}(M(K_{\mathrm{ES}}^{2} + K_{\mathrm{CS}}^{2}))$, which is comparable to that of the
conventional scheme and it is within the computational capability
of a typical state-of-the-art BS.
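
To give a feeling for the magnitudes involved, the two contributions can be evaluated for concrete dimensions. This is only an order-of-magnitude sketch: the constants hidden by the order notation are ignored, the function name `mbd_multiplications` is hypothetical, and the value chosen for the number of edge pilot sequences is an assumption for illustration.

```python
def mbd_multiplications(num_antennas, num_edge_pilots, num_center_pilots):
    """Order-of-magnitude count of complex multiplications for MBD precoding.

    Adds the SVD term M * K_es^2 and the pseudo-inverse term M * K_cs^2,
    dropping all constant factors.
    """
    svd_term = num_antennas * num_edge_pilots ** 2
    pinv_term = num_antennas * num_center_pilots ** 2
    return svd_term + pinv_term

# M = 128 antennas and K_CS = 10 as in Table I; 20 edge pilots is hypothetical.
print(mbd_multiplications(128, 20, 10))
```

For these values the total is dominated by the SVD term, and the whole count grows only linearly with the number of BS antennas.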


VII. Simulation Study 


We evaluated the performance of the proposed SPR and
MBD schemes using a set of Monte-Carlo simulations. A typical
hexagonal cellular network of $L_{\mathrm{total}}$ cells was considered,
where the BS of each cell employed M AEs and the i-th cell
had $K_i$ single-AE users [25], [26]. The default values
of the various parameters of this simulated hexagonal cellular
network are summarized in Table I. The large-scale fading
coefficient was generated according to [12]

$$\beta_{i,j,k} = \frac{z_{i,j,k}}{(r_{i,j,k}/R)^{\alpha}}, \qquad (43)$$


where R denotes the cell radius, $\alpha$ is the path loss
exponent, $r_{i,j,k}$ is the distance between the k-th user
in the j-th cell and the BS in the i-th cell, and $z_{i,j,k}$ denotes
the shadow fading factor, which obeys the log-normal
distribution, i.e., $10\log_{10}(z_{i,j,k})$ follows the zero-mean Gaussian
distribution having a standard deviation of $\sigma_{\mathrm{shadow}}$. The reuse
factor of the center pilot group $\Phi_C$ is 1, i.e., it is reused in all
$L_{\mathrm{total}}$ cells, while the reuse factor of the edge sub pilot groups
$\Phi_E = [\Phi_{E,1}, \Phi_{E,2}, \cdots, \Phi_{E,7}]$ is 7, i.e., the i-th cell utilizes
$\Phi_{E,\mathrm{mod}(i,7)+1}$ and non-adjacent cells reuse the same edge sub
pilot group. The locations of the users in each cell were all
randomly generated in each trial. A particular simulation trial
is shown in Fig. 6, where the red crosses and green dots in each
cell denote the center users and edge users, respectively, which
are classified by the BS based on the threshold $\rho_i$ associated
with the parameter $\lambda = 0.1$. As mentioned previously and also
seen from Fig. 6, the classification of center users and edge
users is not based on their distance from the serving BS.

Fig. 6. An instantiation of the randomly generated user distribution in the
simulated hexagonal cellular network, where red crosses, green dots, and black
numbers denote center users, edge users, and cell numbers, respectively.

Fig. 7. Channel estimation accuracy comparison for the conventional and
proposed SPR schemes with M = 128.
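
The large-scale fading model of (43) and the edge sub pilot group assignment can be sketched as follows. This is a minimal illustration using the default values of Table I; the function names are hypothetical, cells are assumed to be numbered from 1, and the geometry of the hexagonal layout is not modelled.

```python
import numpy as np

def large_scale_fading(r, R=500.0, alpha=3.0, sigma_shadow_db=8.0, rng=None):
    """Draw large-scale coefficients z / (r / R)**alpha for distances r in meters.

    z is log-normal: 10*log10(z) is zero-mean Gaussian with the given
    standard deviation in dB.
    """
    rng = np.random.default_rng() if rng is None else rng
    r = np.asarray(r, dtype=float)
    z = 10.0 ** (rng.normal(0.0, sigma_shadow_db, size=r.shape) / 10.0)
    return z / (r / R) ** alpha

def edge_subgroup(cell_index, reuse_factor=7):
    """Index of the edge sub pilot group used by a cell: mod(i, 7) + 1."""
    return cell_index % reuse_factor + 1

# Coefficients for users at 30 m, 250 m and 500 m from a BS (one random draw).
print(large_scale_fading([30.0, 250.0, 500.0], rng=np.random.default_rng(1)))
# Edge sub pilot groups of the 19 cells.
print([edge_subgroup(i) for i in range(1, 20)])
```

Because of the 8 dB shadowing term, a user close to its BS can still draw a small coefficient, which is why the center/edge classification need not follow the distance from the serving BS.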

Fig. 7 compares the channel estimation accuracies as functions
of the grouping parameter $\lambda$ for both the conventional
and the proposed SPR schemes with M = 128. In each
simulation trial, the channel estimation mean square error
(MSE) of the edge users is calculated as

$$\mathrm{MSE}_e = E\left\{\frac{1}{\sum_{i=1}^{L}K_{i,e}}\sum_{i=1}^{L}\sum_{k=1}^{K_{i,e}}\left\|\hat{\mathbf{h}}^{e}_{i,i,k} - \mathbf{h}^{e}_{i,i,k}\right\|^{2}\right\}, \qquad (44)$$

where $\hat{\mathbf{h}}^{e}_{i,i,k}$ denotes the estimate of the true channel vector
$\mathbf{h}^{e}_{i,i,k}$, while the MSE of the channel estimation for the center
users is defined as

$$\mathrm{MSE}_c = E\left\{\frac{1}{\sum_{i=1}^{L}K_{i,c}}\sum_{i=1}^{L}\sum_{k=1}^{K_{i,c}}\left\|\hat{\mathbf{h}}^{c}_{i,i,k} - \mathbf{h}^{c}_{i,i,k}\right\|^{2}\right\}, \qquad (45)$$

where $\hat{\mathbf{h}}^{c}_{i,i,k}$ denotes the estimate of the true channel vector
$\mathbf{h}^{c}_{i,i,k}$. The average results over 100 random simulation runs
are presented in Fig. 7. By increasing the grouping parameter
$\lambda$, more users will be regarded as edge users. As expected,
for the center users, who only suffer from a slight PC, both
the conventional and proposed SPR schemes attain the same
excellent channel estimation accuracy. However, the conventional
scheme attains a poor channel estimation accuracy for
the edge users, who suffer from severe pilot contamination. By
contrast, since the PC is eliminated by applying orthogonal
pilot sequences for the edge users in the adjacent cells, the
channel estimation accuracy achieved by the proposed SPR
scheme is significantly improved.

Fig. 8. The CDF of UL SINR for the conventional and proposed SPR
schemes with M = 128 and $\lambda = 0.1$.

Fig. 9. The CDF of UL achievable rate for the conventional and proposed
SPR schemes with M = 128 and $\lambda = 0.1$.

Fig. 8 shows the cumulative distribution function (CDF) of the UL
SINR for both the conventional and the proposed SPR
schemes with M = 128 and $\lambda = 0.1$, where the results are
obtained from 1000 random simulation trials. In each simulation
run, the conventional scheme calculates the UL SINR of a
center or an edge user according to the first line of the
corresponding detected-symbol expression. For
our SPR scheme, the UL SINR of center users is similar to
that for the conventional scheme, while the UL SINR of edge
users is calculated by the first line of (36) in each simulation
trial. Since the UL transmission of the proposed SPR scheme
is almost the same as that of the conventional scheme for
the center users, their curves in Fig. 8 almost coincide.
As observed in Fig. 8, our SPR scheme attains a significantly
higher UL SINR for the edge users than the conventional
scheme. Furthermore, the ZF detector is always better than the MF
detector by about 2 dB for both center users and edge users.

Fig. 10. The average UL cell throughput for the conventional and proposed
SPR schemes with M = 128 against $\lambda$.

Fig. 9 shows the CDF of the UL achievable rate for the
conventional and proposed SPR schemes with M = 128 and
$\lambda = 0.1$, where the results are obtained from 500 random
simulation runs. In each simulation trial, we generate the user
positions first and then generate the channels of users
50 times to obtain the UL achievable rate. Although the UL
SINR results of the center users of the conventional and SPR
schemes are the same, as shown in Fig. 8, the UL achievable
rates are different due to the different pilot overheads of the
two schemes. It is clear that the UL achievable rate of edge users
is significantly improved by the SPR scheme, while the UL
achievable rate of center users decreases due to the increased
pilot overhead. Moreover, the ZF detector always outperforms
the MF detector by about 0.3 bps/Hz per user.

Fig. 10 shows the average UL cell throughput for the
conventional and proposed SPR schemes with M = 128. It
is clear that by increasing the grouping parameter $\lambda$, more
users are regarded as edge users, which leads to an
increase of the pilot overhead and a decrease of the UL achievable
rate of center users. Thus, the proper selection of the grouping
parameter is important, e.g., $\lambda < 0.2$, which both improves
the performance of edge users and also maintains the cell
throughput. Otherwise, e.g., for $\lambda > 0.5$, the loss
caused by the excessive pilot overhead outweighs the gain of the
SPR scheme. In addition, the ZF detector always provides a
gain of about 5 bps/Hz in average cell throughput compared with
the MF detector.

Fig. 11. The average UL cell throughput for the conventional and proposed
SPR schemes with $\lambda = 0.1$ against M.

Fig. 11 shows the average UL cell throughput comparison
of the conventional and proposed SPR schemes with $\lambda = 0.1$
against the number of BS antennas M. When the number
of BS antennas is small, e.g., M = 32, the average UL
cell throughput of the proposed SPR scheme is about 5 bps/Hz
lower than that of the conventional scheme when the MF
detector is adopted. To obtain the performance
gain for edge users shown in Fig. 9, the proposed SPR
scheme sacrifices spectral efficiency due to the increased
pilot overhead, which leads to a reduction of the average UL
cell throughput. However, upon increasing the number of BS antennas,
e.g., to M = 256, the average UL cell
throughput of the proposed SPR scheme approaches that of
the conventional scheme, since the performance of edge users
can be significantly improved by increasing M. Moreover, the
ZF detector substantially outperforms the MF detector when M is
small, and the gap shrinks as M increases.

Fig. 12 shows the CDF of the DL achievable rate for the
conventional system, the SPR aided system as well as the SPR
and MBD assisted system with M = 128 and $\lambda = 0.1$. Without
the MBD precoding scheme, the results of the DL
achievable rate are similar to those of the UL achievable rate
shown in Fig. 9 due to their duality property. When the MBD
precoding is considered, we find that the DL achievable rate of
edge users can be significantly improved due to the elimination
of the ICI, while the DL achievable rate of center users
slightly decreases, since the projection operator of the MBD
precoding sacrifices degrees of freedom of the DL signals for
center users. Again, we find that the ZFMBD precoding
achieves a gain of about 0.4 bps/Hz compared with the MFMBD
precoding for edge users.

Fig. 13 shows the average DL cell throughput for the
conventional system, the SPR aided system as well as the
SPR and MBD assisted system with $\lambda = 0.1$ against M.
The conventional system outperforms the SPR aided system,
while the SPR and MBD assisted system performs worst when
a small number of BS antennas is considered, e.g., M = 32.
However, for a typical massive MIMO configuration
such as M = 256, the average DL cell throughput
of the SPR aided system approaches that of the conventional
scheme, and the SPR and MBD assisted system performs
best of all three. Moreover, when M is further increased, the
performance gap between the SPR and MBD assisted system
and the conventional system will also become larger, which
means that the increased rate of edge users becomes larger
than the decreased rate of center users.

Fig. 13. The average DL cell throughput for the conventional system, the
SPR aided system as well as the SPR and MBD assisted system with $\lambda = 0.1$
against M.

VIII. Conclusions 

We have developed a soft pilot reuse and multi-cell block
diagonalization precoding regime for LS-MIMO systems, which
is capable of significantly enhancing both the achievable UL
and DL rate for edge users. Our contribution is twofold. Firstly,
we break away from the traditional practice of treating all users
as though they suffer from the same level of PC, and propose a
simple yet effective means of dividing the users into cell-center
and cell-edge users. This grouping allows us to apply the
proposed SPR scheme, whereby a center pilot group is reused
for the center users in all cells, while the edge pilot group is
applied to the edge users in the adjacent cells. By requiring
a slightly increased number of pilot sequences, the proposed
SPR scheme eliminates the pilot contamination inflicted upon
the edge users, who would otherwise suffer from severe PC in
the conventional scheme. This significantly enhances the QoS
for the edge users, while the average UL
and DL cell throughput suffer only a slight reduction
compared with that of the conventional system. Secondly,
we exploit the fact that, with the aid of our SPR regime, the BS
becomes capable of estimating the inter-cell channels of the
edge users in the adjacent cells without
the deleterious effects of PC. On this basis, we extend the classical
BD precoding to a multi-cell scenario and propose the MBD
precoding to eliminate the ICI imposed on the edge users of
the adjacent cells in the DL. This MBD precoding further
enhances the performance of edge users in DL transmission
and improves the average DL cell throughput in addition to
the gain obtained by the SPR scheme.
Fig. 12. The CDF of DL achievable rate for the conventional system, the SPR aided system as well as the SPR and MBD assisted system with M = 128
and $\lambda = 0.1$.